Document Re-ranking by Generality in Bio-medical Information Retrieval
Authors
Abstract
Document ranking is a crucial process in information retrieval (IR): it presents retrieved documents in order of their estimated relevance to the query. Traditional document ranking methods are based on different measures of similarity between documents and the query. With the information explosion and the popularity of WWW information retrieval, the increased variety of information and users makes it insufficient to consider similarity alone in the ranking process. In some cases, users need to retrieve documents that describe a certain topic generally or broadly. This is particularly the case in specific domains such as bio-medical IR. To satisfy the stringent requirement of generality-based retrieval, we propose a novel approach that re-ranks the retrieved documents by considering their generality as a complement to similarity. Document generality is quantified by analyzing the semantic cohesion of the text. The retrieved documents are then re-ranked by a combined score of their similarity and the closeness of each document's generality to the query's. Results show encouraging performance on a large-scale bio-medical text corpus, OHSUMED, a subset of the MEDLINE collection containing 348,566 medical journal references and 101 test queries.
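The re-ranking step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: the names `Candidate`, `generality_closeness` and `rerank` are hypothetical, closeness is taken as one minus the absolute difference of generality scores, both scores are assumed to lie in [0, 1], and the two parts are mixed linearly with a weight `alpha`. How generality itself is computed from semantic cohesion is not shown.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A retrieved document with two precomputed scores (both assumed in [0, 1])."""
    doc_id: str
    similarity: float  # score from the first-pass, similarity-based ranker
    generality: float  # hypothetical generality score of the document


def generality_closeness(doc_generality: float, query_generality: float) -> float:
    """Assumed closeness measure: 1 when the generalities match, 0 when maximally apart."""
    return 1.0 - abs(doc_generality - query_generality)


def rerank(candidates, query_generality: float, alpha: float = 0.7):
    """Sort candidates by an assumed linear mix of similarity and generality closeness.

    alpha = 1.0 reproduces the original similarity-only ranking.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")

    def combined(c: Candidate) -> float:
        closeness = generality_closeness(c.generality, query_generality)
        return alpha * c.similarity + (1.0 - alpha) * closeness

    return sorted(candidates, key=combined, reverse=True)


# Usage: a broad query (generality 0.9) promotes the broader document.
docs = [
    Candidate("specific-study", similarity=0.9, generality=0.1),
    Candidate("general-review", similarity=0.8, generality=0.9),
]
reranked = rerank(docs, query_generality=0.9, alpha=0.5)
print([c.doc_id for c in reranked])
```

A linear mix is used here only because it makes the trade-off between similarity and generality easy to tune with one parameter; the paper may combine the scores differently.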
Similar resources
Document generality: its computation for ranking
The increased variety of information makes it critical to retrieve documents which are not only relevant but also broad enough to cover as many different aspects of a certain topic as possible. The increased variety of users also makes it critical to retrieve documents that are jargon free and easy-to-understand rather than the specific technical materials. In this paper, we propose a new conce...
Investigating the Impact of Authors’ Rank in Bibliographic Networks on Expertise Retrieval
Background and Aim: This research investigates the impact of authors’ rank in bibliographic networks on the document-centered model of expertise retrieval. Its purpose is to find out what kind of author ranking in bibliographic networks can improve the performance of the document-centered model. Methodology: The current research is experimental. To operationalize the research goals, a new test colle...
Efficient Algorithm for Mining on Bio Medical Data for Ranking the Web Pages
Information on the internet is growing in volume and arriving from many different sources. Extracting tuples from HTML pages has been an important issue in various web applications such as web data integration, e-commerce market monitoring, and mashups that repurpose and selectively combine existing web data services. Data mining is the process of analyzing data from different perspectives and...
Cannabis_TREATS_cancer: Incorporating Fine-Grained Ontological Relations in Medical Document Ranking
Previous work has justified the assumption that document ranking can be improved by further considering coarse-grained relations at various linguistic levels (e.g., lexical, syntactic and semantic). To the best of our knowledge, little work has been reported on incorporating fine-grained ontological relations (e.g., ) in document ranking. Two contributions are wo...
Publication date: 2005